AITopics | miss rate


Graph Reordering for Cache-Efficient Near Neighbor Search

Neural Information Processing Systems | Feb-13-2026, 00:55:03 GMT

Graph search is one of the most successful algorithmic trends in near neighbor search. Several of the most popular and empirically successful algorithms are, at their core, a greedy walk along a pruned near neighbor graph.
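To make the "greedy walk" concrete, here is a minimal Python sketch. It is not the paper's implementation: the function name `greedy_search`, the toy `points`, and the hand-built `graph` are all hypothetical, and real systems typically keep a candidate list (beam) rather than a single current node.

```python
import math

def greedy_search(points, graph, query, start):
    """Walk the graph greedily: hop to the neighbor closest to the query,
    and stop once no neighbor is closer than the current node."""
    current = start
    current_dist = math.dist(points[current], query)
    visited = [current]
    while True:
        best, best_dist = current, current_dist
        for nb in graph[current]:
            d = math.dist(points[nb], query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:          # local minimum: no neighbor improves
            return current, visited
        current, current_dist = best, best_dist
        visited.append(current)

# Hypothetical toy data: five 2-D points and a sparse neighbor graph.
points = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0), 4: (3.0, 1.0)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nearest, path = greedy_search(points, graph, query=(3.1, 0.9), start=0)
print(nearest, path)
```

Each hop reads the coordinates of a different node, so the order of the visited nodes is also the order of memory accesses. That is presumably why the in-memory layout of nodes can affect cache miss rates in this kind of search.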

artificial intelligence, graph, information management, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology:

Information Technology > Information Management > Search (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

X-SYCON: Xylem-Inspired Passive Gradient Control for Communication-Free Swarm Response in Dynamic Disaster Environments

Baek, Arthur Ji Sung, Martin, Geoffrey

arXiv.org Artificial Intelligence | Dec-2-2025

We present X-SYCON, a xylem-inspired multi-agent architecture in which coordination emerges from passive field dynamics rather than explicit planning or communication. Incidents (demands) and obstructions (hazards) continually write diffusing and decaying scalar fields, and agents greedily ascend a local utility $U=ϕ_{\mathrm{DE}}-κ\,ϕ_{\mathrm{HZ}}$ with light anti-congestion and separation. A beaconing rule triggered on first contact temporarily deepens the local demand sink, accelerating completion without reducing time-to-first-response. Across dynamic, partially blocked simulated environments, we observe low miss rates and stable throughput with interpretable, tunable trade-offs over carrier count, arrival rate, hazard density, and hazard sensitivity $κ$. We derive that a characteristic hydraulic length scale $\ell\approx\sqrt{D/λ}$ predicts recruitment range in a continuum approximation, and we provide a work-conservation (Ohm-law) bound consistent with sublinear capacity scaling with team size. Empirically: (i) soft hazard penalties yield fewer misses when obstacles already block motion; (ii) throughput saturates sublinearly with carriers while reliability improves sharply; (iii) stronger arrivals can reduce misses by sustaining sinks that recruit help; and (iv) phase-stability regions shrink with hazard density but are recovered by more carriers or higher arrivals. We refer to X-SYCON as an instance of Distributed Passive Computation and Control, and we evaluate it in simulations modeling communication-denied disaster response and other constrained sensing-action regimes.
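The two formulas in the abstract can be illustrated with a small Python sketch. It assumes a static 2-D grid and covers only the utility $U=ϕ_{\mathrm{DE}}-κ\,ϕ_{\mathrm{HZ}}$ and the length scale $\ell\approx\sqrt{D/λ}$. The field dynamics, anti-congestion, separation, and beaconing are left out, and every name and number below is hypothetical.

```python
import math

def hydraulic_length(D, lam):
    """Characteristic length scale ell ~ sqrt(D / lambda)."""
    return math.sqrt(D / lam)

def utility(phi_de, phi_hz, kappa):
    """Cell-wise utility U = phi_DE - kappa * phi_HZ."""
    return [[de - kappa * hz for de, hz in zip(row_de, row_hz)]
            for row_de, row_hz in zip(phi_de, phi_hz)]

def greedy_step(U, pos):
    """Move to the 4-neighbor (or stay put) with the highest utility."""
    r, c = pos
    candidates = [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    in_bounds = [(i, j) for i, j in candidates
                 if 0 <= i < len(U) and 0 <= j < len(U[0])]
    return max(in_bounds, key=lambda p: U[p[0]][p[1]])

# Hypothetical demand and hazard fields on a 3x3 grid.
phi_de = [[0.1, 0.4, 0.9],
          [0.1, 0.3, 0.6],
          [0.0, 0.1, 0.2]]
phi_hz = [[0.0, 0.5, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]

U = utility(phi_de, phi_hz, kappa=1.0)
ell = hydraulic_length(D=4.0, lam=1.0)   # in grid units, for these made-up values
next_pos = greedy_step(U, (1, 1))
print(ell, next_pos)
```

With these values the agent at the center steps toward the demand peak and avoids the cell above it, where the hazard term outweighs the demand.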

artificial intelligence, decay, x-sycon, (18 more...)

arXiv.org Artificial Intelligence

2512.00018

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models

Bayat, Farima Fatahi, Pezeshkpour, Pouya, Hruschka, Estevam

arXiv.org Artificial Intelligence | Nov-17-2025

Tool-augmented Language Models (TaLMs) can invoke external tools to solve problems beyond their parametric capacity. However, it remains unclear whether these tool-enabled gains reflect trustworthy reasoning. Focusing on the Code Interpreter tool, we show that even when tools are selected and executed correctly, TaLMs treat tool outputs as substitutes for reasoning, producing solutions that appear correct but lack coherent justification. We term this failure mode Tool-Induced Myopia (TIM), and study it using PYMATH, a benchmark of 1,679 competition-level mathematical problems for which Python code is helpful but not sufficient. We further develop a multi-dimensional evaluation suite to quantify reasoning degradation in TaLMs relative to their non-tool counterparts. Our findings reveal that while TaLMs achieve up to a 19.3 percentage point gain in final-answer accuracy, their reasoning behavior consistently deteriorates (e.g., non-tool LLMs win up to 41.5% more often in pairwise comparisons of the reasoning process). This degradation intensifies with tool use: the more frequently a model invokes tools, the less coherent its reasoning becomes. Moreover, tool use shifts errors from arithmetic mistakes toward global reasoning failures (logic, assumption, creativity), with TIM present in ~55% of high-risk cases. Finally, we propose a preference-optimization-based framework that realigns TaLMs to use tools as assistive evidence, improving both final-answer accuracy and reasoning depth under tool use. Code and data are available at: https://github.com/megagonlabs/TIM.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.10899

Country:

Asia (0.68)
North America > United States (0.46)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Are Large Language Models a Good Replacement of Taxonomies?

Sun, Yushi, Xin, Hao, Sun, Kai, Xu, Yifan Ethan, Yang, Xiao, Dong, Xin Luna, Tang, Nan, Chen, Lei

arXiv.org Artificial Intelligence | Jun-20-2024

Large language models (LLMs) demonstrate an impressive ability to internalize knowledge and answer natural language questions. Although previous studies validate that LLMs perform well on general knowledge while performing poorly on long-tail nuanced knowledge, the community is still doubtful about whether traditional knowledge graphs should be replaced by LLMs. In this paper, we ask whether the schema of a knowledge graph (i.e., its taxonomy) is made obsolete by LLMs. Intuitively, LLMs should perform well on common taxonomies and at taxonomy levels that are common to people. Unfortunately, there is no comprehensive benchmark that evaluates LLMs over a wide range of taxonomies, from common to specialized domains and at levels from root to leaf, so that we can draw a confident conclusion. To narrow the research gap, we constructed a novel taxonomy hierarchical structure discovery benchmark named TaxoGlimpse to evaluate the performance of LLMs over taxonomies. TaxoGlimpse covers ten representative taxonomies from common to specialized domains, with in-depth experiments on entities at different levels of each taxonomy from root to leaf. Our comprehensive experiments on eighteen state-of-the-art LLMs under three prompting settings validate that LLMs still cannot capture the knowledge of specialized taxonomies and leaf-level entities well.

knowledge, llm, taxonomy, (14 more...)

arXiv.org Artificial Intelligence

2406.11131

Country:

North America > United States (0.46)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Services (0.69)
Health & Medicine > Public Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Evaluating LLMs for Gender Disparities in Notable Persons

Rhue, Lauren, Goethals, Sofie, Sundararajan, Arun

arXiv.org Artificial Intelligence | Mar-14-2024

This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect "hallucinated" responses or to decline to answer a prompt at all. Specifically, it investigates the presence of gender-based biases in LLMs' responses to factual inquiries. This paper takes a multi-pronged approach to evaluating GPT models by assessing fairness across multiple dimensions of recall, hallucinations and declinations. Our findings reveal discernible gender disparities in the responses generated by GPT-3.5. While advancements in GPT-4 have led to improvements in performance, they have not fully eradicated these gender disparities, notably in instances where responses are declined. The study further explores the origins of these disparities by examining the influence of gender associations in prompts and the homogeneity in the responses.

gender disparity, gpt-3, hallucination, (10 more...)

arXiv.org Artificial Intelligence

2403.09148

Country:

North America > United States > New York (0.04)
North America > United States > Maryland (0.04)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)
Personal > Honors > Award (0.49)

Industry:

Leisure & Entertainment (0.94)
Media > Film (0.68)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators

Kinnunen, Tomi, Lee, Kong Aik, Tak, Hemlata, Evans, Nicholas, Nautsch, Andreas

arXiv.org Artificial Intelligence | Sep-21-2023

Presentation attack (spoofing) detection (PAD) typically operates alongside biometric verification to improve reliability in the face of spoofing attacks. Even though the two sub-systems operate in tandem to solve the single task of reliable biometric verification, they address different detection tasks and are hence typically evaluated separately. Evidence shows that this approach is suboptimal. We introduce a new metric for the joint evaluation of PAD solutions operating in situ with biometric verification. In contrast to the tandem detection cost function proposed recently, the new tandem equal error rate (t-EER) is parameter free. The combination of two classifiers nonetheless leads to a set of operating points at which false alarm and miss rates are equal and also dependent upon the prevalence of attacks. We therefore introduce the concurrent t-EER, a unique operating point which is invariant to the prevalence of attacks. Using both modality (and even application) agnostic simulated scores, as well as real scores for a voice biometrics application, we demonstrate application of the t-EER to a wide range of biometric system evaluations under attack. The proposed approach is a strong candidate metric for the tandem evaluation of PAD systems and biometric comparators.
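For readers unfamiliar with the underlying quantity, the sketch below computes the conventional equal error rate of a single detector: the operating point where miss rate and false alarm rate coincide. It does not implement the t-EER or the concurrent t-EER, which involve two classifiers. The function names and scores are hypothetical, and a threshold scan over observed scores is assumed to be good enough for illustration.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Miss and false alarm rates when scores >= threshold are accepted."""
    miss = sum(s < threshold for s in target_scores) / len(target_scores)
    false_alarm = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return miss, false_alarm

def equal_error_rate(target_scores, nontarget_scores):
    """Scan observed scores as thresholds and keep the one where the gap
    between miss rate and false alarm rate is smallest."""
    best = None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        miss, fa = error_rates(target_scores, nontarget_scores, t)
        gap = abs(miss - fa)
        if best is None or gap < best[0]:
            best = (gap, t, (miss + fa) / 2)
    return best[1], best[2]

# Hypothetical detection scores (higher means "more likely a target").
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
threshold, eer = equal_error_rate(targets, nontargets)
print(threshold, eer)
```

With one detector there is a single threshold to scan. With two detectors in tandem there are two thresholds, which is why the abstract speaks of a set of equal-error operating points.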

asv, false alarm rate, threshold, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TPAMI.2023.3313648

2309.12237

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Singapore (0.05)
Europe > France (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

LMR: Lane Distance-Based Metric for Trajectory Prediction

Schmidt, Julian, Monninger, Thomas, Jordan, Julian, Dietmayer, Klaus

arXiv.org Artificial Intelligence | Apr-13-2023

The development of approaches for trajectory prediction requires metrics to validate and compare their performance. Currently established metrics are based on Euclidean distance, which means that errors are weighted equally in all directions. Euclidean metrics are insufficient for structured environments like roads, since they do not properly capture the agent's intent relative to the underlying lane. In order to provide a reasonable assessment of trajectory prediction approaches with regard to the downstream planning task, we propose a new metric that is lane distance-based: Lane Miss Rate (LMR). For the calculation of LMR, the ground-truth and predicted endpoints are assigned to lane segments, more precisely their centerlines. Measured by the distance along the lane segments, predictions that are within a certain threshold distance of the ground truth count as hits; otherwise they count as misses. LMR is then defined as the ratio of sequences that yield a miss. Our results on three state-of-the-art trajectory prediction models show that LMR preserves the order of Euclidean distance-based metrics. In contrast to the Euclidean Miss Rate, qualitative results show that LMR yields misses for sequences where predictions are located on wrong lanes. Hits, on the other hand, result for sequences where predictions are located on the correct lane. This means that LMR implicitly weights Euclidean error relative to the lane and moves toward capturing the intents of traffic agents. The source code of LMR for Argoverse 2 is publicly available.
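The final counting step of LMR, as described above, can be sketched in a few lines of Python. This is an illustration, not the authors' released code. It assumes the endpoints have already been assigned to centerlines and the along-lane distance per sequence is given. The name `lane_miss_rate`, the 2.0 m threshold, the use of `math.inf` for a prediction on an unreachable lane, and counting a distance equal to the threshold as a hit are all assumptions.

```python
import math

def lane_miss_rate(lane_distances, threshold):
    """Fraction of sequences whose along-lane endpoint distance exceeds
    the threshold (a miss); distances at or below it count as hits."""
    if not lane_distances:
        raise ValueError("need at least one sequence")
    misses = sum(d > threshold for d in lane_distances)
    return misses / len(lane_distances)

# Hypothetical along-lane distances in meters, one per sequence.
# math.inf stands in for a prediction that landed on an unreachable lane.
distances_m = [0.5, 1.8, 3.2, math.inf, 0.1]
lmr = lane_miss_rate(distances_m, threshold=2.0)
print(lmr)
```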

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2304.05869

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Adaptive Caching by Refetching

Neural Information Processing Systems | Apr-6-2023, 16:17:36 GMT

We are constructing caching policies that have 13–20% lower miss rates than the best of twelve baseline policies over a large variety of request streams. This represents an improvement of 49–63% over Least Recently Used, the most commonly implemented policy. We achieve this not by designing a specific new policy but by using on-line Machine Learning algorithms to dynamically shift between the standard policies based on their observed miss rates. A thorough experimental evaluation of our techniques is given, as well as a discussion of what makes caching an interesting on-line learning problem.
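As a rough illustration of "observed miss rates" per policy, the Python sketch below replays one request stream through two standard policies (LRU and FIFO) and picks the one with the lower miss rate. It is not the paper's on-line learning method, which shifts between policies dynamically. The stream, the cache capacity, and the function names are hypothetical.

```python
from collections import OrderedDict, deque

def lru_miss_rate(requests, capacity):
    """Replay requests through a Least Recently Used cache."""
    cache = OrderedDict()
    misses = 0
    for key in requests:
        if key in cache:
            cache.move_to_end(key)          # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict least recently used
            cache[key] = True
    return misses / len(requests)

def fifo_miss_rate(requests, capacity):
    """Replay requests through a First In, First Out cache."""
    cache, order = set(), deque()
    misses = 0
    for key in requests:
        if key not in cache:
            misses += 1
            if len(cache) >= capacity:
                cache.discard(order.popleft())  # evict oldest insertion
            cache.add(key)
            order.append(key)
    return misses / len(requests)

# Hypothetical request stream with one "hot" key.
requests = ["a", "b", "a", "c", "a", "d", "a", "b"]
rates = {"LRU": lru_miss_rate(requests, 2), "FIFO": fifo_miss_rate(requests, 2)}
best_policy = min(rates, key=rates.get)
print(rates, best_policy)
```

On this stream LRU keeps the hot key resident and misses less often than FIFO. On other streams the ranking can flip, which is the situation that motivates switching between policies.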

adaptive caching, miss rate, refetching

Neural Information Processing Systems

Genre: Instructional Material > Online (0.70)

Industry: Education > Focused Education > Special Education (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Artificial intelligence excels at catching pre-cancerous cells

#artificialintelligence | Apr-20-2022, 15:40:29 GMT

Authors of a new study on detecting precancerous polyps in colorectal cancer screening came to the conclusion that "Artificial Intelligence (AI) may detect colorectal polyps that have been missed due to perceptual pitfalls." They go on to say "By reducing such miss rate, Artificial Intelligence may increase the detection of colorectal neoplasia leading to a higher degree of Colorectal Cancer (CRC) prevention." According to a news release, a team of international researchers led by Mayo Clinic reported that AI "reduced by twofold the rate at which pre-cancerous polyps were missed in colorectal cancer screening." The Mayo Clinic defines a colon polyp as "a small clump of cells that forms on the lining of the colon" and says most are harmless. Yet, it cautions, "over time, some colon polyps can develop into colon cancer, which may be fatal when found in its later stages."

colonoscopy, colorectal cancer screening, polyp, (8 more...)

#artificialintelligence

Country:

North America > United States > Florida > Duval County > Jacksonville (0.06)
Europe > Italy (0.06)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.06)

Genre: Research Report (0.55)

Industry: Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

AI reduces miss rate of precancerous polyps in colorectal cancer screening

#artificialintelligence | Apr-20-2022, 02:32:10 GMT

Most colon polyps are harmless, but some develop over time into colon or rectal cancer, which can be fatal if found in its later stages. Colorectal cancer is the second most deadly cancer in the world, with an estimated 1.9 million cases and 916,000 deaths worldwide in 2020, according to the World Health Organization. A colonoscopy is an exam used to detect changes or abnormalities in the large intestine (colon) and rectum. Between February 2020 and May 2021, 230 study participants each underwent two back-to-back colonoscopies on the same day at eight hospitals and community clinics in the U.S., U.K. and Italy. One colonoscopy used AI; the other, a standard colonoscopy, did not. The rate at which precancerous colorectal polyps are missed has been estimated to be 25%.

jacksonville, mayo clinic, mayo clinic health system, (8 more...)

#artificialintelligence

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.19)
North America > United States > Florida > Duval County > Jacksonville (0.14)
North America > United States > Wisconsin > La Crosse County > La Crosse (0.10)
(4 more...)

Genre: Research Report > New Finding (0.78)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)

Technology: Information Technology > Artificial Intelligence > Applied AI (0.37)
